{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$f'(x)$ is a function: the derivative of $f$.\n",
    "\n",
    "$f'(a)$ is a number: the value of that derivative at the point $a$, when $f$ is a function of one variable."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Derivative math"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ \\frac{df}{dx}(a) = \n",
    "\\lim_{\\Delta \\to 0} \\frac{{f \\left( {a + \\Delta } \\right) - f\\left( a - \\Delta \\right)}}{2 * \\Delta } $$"
   ]
  },
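  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a sanity check of the central-difference definition, here is a worked example. The function $f(x) = x^2$ and the point $a = 3$ are chosen here purely for illustration:\n",
    "\n",
    "$$ \\frac{f(3 + \\Delta) - f(3 - \\Delta)}{2 * \\Delta} = \\frac{(9 + 6 * \\Delta + \\Delta^2) - (9 - 6 * \\Delta + \\Delta^2)}{2 * \\Delta} = \\frac{12 * \\Delta}{2 * \\Delta} = 6 $$\n",
    "\n",
    "The ratio equals $6$ for every $\\Delta \\neq 0$, so the limit is $6$, which agrees with the familiar rule $f'(x) = 2x$ at $x = 3$."
   ]
  },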
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ f_2(f_1(x)) = y $$\n",
    "\n",
    "$$ f_1(x) = u $$\n",
    "\n",
    "$$ \\frac{df_2}{dx}(x) = \\frac{df_2}{du}(f_1(x)) * \\frac{df_1}{dx}(x) $$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ \\frac{df_3}{dx}(x) = \\frac{df_3}{dv}(f_2(f_1(x))) * \\frac{df_2}{du}(f_1(x)) * \\frac{df_1}{dx}(x) $$"
   ]
  },
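  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see the chain rule for three nested functions in action, here is a small worked example. The specific functions are assumptions picked for illustration: $f_1(x) = x^2$, $f_2(u) = \\sin(u)$, and $f_3(v) = e^v$, so the composite is $f_3(f_2(f_1(x))) = e^{\\sin(x^2)}$.\n",
    "\n",
    "$$ \\frac{df_3}{dv}(f_2(f_1(x))) = e^{\\sin(x^2)}, \\qquad \\frac{df_2}{du}(f_1(x)) = \\cos(x^2), \\qquad \\frac{df_1}{dx}(x) = 2 * x $$\n",
    "\n",
    "$$ \\frac{df_3}{dx}(x) = e^{\\sin(x^2)} * \\cos(x^2) * 2 * x $$\n",
    "\n",
    "Each factor is the derivative of one function, evaluated at the output of the functions nested inside it."
   ]
  },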
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ \\frac{df}{dx}\\bigr\\rvert_{x=a} = \n",
    "\\lim_{\\Delta \\to 0} \\frac{{f \\left( {a + \\Delta } \\right) - f\\left( a - \\Delta \\right)}}{2 * \\Delta } $$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Function with multiple inputs example"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ f(x, y) = s$$\n",
    "\n",
    "$$ a = a(x, y) = x + y $$\n",
    "\n",
    "$$ s = \\sigma(a) $$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ f(x, y) = \\sigma(a(x, y)) $$\n",
    "\n",
    "$$ \\frac{\\partial f}{\\partial x} = \\frac{\\partial \\sigma}{\\partial u}(a(x, y)) * \\frac{\\partial a}{\\partial x}(x, y) \\\\ = \\frac{\\partial \\sigma}{\\partial u}(x + y) * \\frac{\\partial a}{\\partial x}(x, y)$$"
   ]
  },
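  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The notes do not say which function $\\sigma$ is. Assuming, for illustration only, that it is the sigmoid, the partial derivatives can be written out in full:\n",
    "\n",
    "$$ \\sigma(u) = \\frac{1}{1 + e^{-u}}, \\qquad \\frac{\\partial \\sigma}{\\partial u}(u) = \\sigma(u) * (1 - \\sigma(u)) $$\n",
    "\n",
    "Since $a(x, y) = x + y$, both $\\frac{\\partial a}{\\partial x}$ and $\\frac{\\partial a}{\\partial y}$ equal $1$, so\n",
    "\n",
    "$$ \\frac{\\partial f}{\\partial x} = \\frac{\\partial f}{\\partial y} = \\sigma(x + y) * (1 - \\sigma(x + y)) $$\n",
    "\n",
    "At $x = y = 0$ this gives $\\frac{1}{2} * \\frac{1}{2} = \\frac{1}{4}$."
   ]
  },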
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Matrix multiplication example"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ X = \\begin{bmatrix}\n",
    "x_{11} & x_{12} & x_{13} \\\\\n",
    "x_{21} & x_{22} & x_{23} \\\\\n",
    "x_{31} & x_{32} & x_{33}\n",
    "\\end{bmatrix} $$\n",
    "\n",
    "$$ W = \\begin{bmatrix}\n",
    "w_{11} & w_{12} \\\\\n",
    "w_{21} & w_{22} \\\\\n",
    "w_{31} & w_{32}\n",
    "\\end{bmatrix} $$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ \\nu(X, W) = X * W = \\begin{bmatrix}\n",
    "x_{11} * w_{11} + x_{12} * w_{21} + x_{13} * w_{31} &\n",
    "x_{11} * w_{12} + x_{12} * w_{22} + x_{13} * w_{32}\n",
    "\\\\\n",
    "x_{21} * w_{11} + x_{22} * w_{21} + x_{23} * w_{31} &\n",
    "x_{21} * w_{12} + x_{22} * w_{22} + x_{23} * w_{32}\n",
    "\\\\\n",
    "x_{31} * w_{11} + x_{32} * w_{21} + x_{33} * w_{31} &\n",
    "x_{31} * w_{12} + x_{32} * w_{22} + x_{33} * w_{32}\n",
    "\\end{bmatrix} = \n",
    "\\begin{bmatrix}\n",
    "XW_{11} &\n",
    "XW_{12}\n",
    "\\\\\n",
    "XW_{21} &\n",
    "XW_{22}\n",
    "\\\\\n",
    "XW_{31} &\n",
    "XW_{32}\n",
    "\\end{bmatrix}\n",
    "$$"
   ]
  },
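  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The six entries above all follow one pattern. As a compact restatement (the summation index $k$ is notation introduced here), the entry in row $i$ and column $j$ of $X * W$ is\n",
    "\n",
    "$$ XW_{ij} = \\sum_{k=1}^{3} x_{ik} * w_{kj}, \\qquad i \\in \\{1, 2, 3\\}, \\quad j \\in \\{1, 2\\} $$\n",
    "\n",
    "so a $3 \\times 3$ matrix times a $3 \\times 2$ matrix gives a $3 \\times 2$ matrix, and $XW_{ij}$ depends only on row $i$ of $X$ and column $j$ of $W$."
   ]
  },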
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\n",
    "\\sigma(XW_{11}) = \\sigma(x_{11} * w_{11} + x_{12} * w_{21} + x_{13} * w_{31}) \\\\\n",
    "\\sigma(XW_{12}) = \\sigma(x_{11} * w_{12} + x_{12} * w_{22} + x_{13} * w_{32}) \\\\\n",
    "\\cdots\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ \\sigma(X * W) = \\begin{bmatrix}\n",
    "\\sigma(x_{11} * w_{11} + x_{12} * w_{21} + x_{13} * w_{31}) &\n",
    "\\sigma(x_{11} * w_{12} + x_{12} * w_{22} + x_{13} * w_{32})\n",
    "\\\\\n",
    "\\sigma(x_{21} * w_{11} + x_{22} * w_{21} + x_{23} * w_{31}) &\n",
    "\\sigma(x_{21} * w_{12} + x_{22} * w_{22} + x_{23} * w_{32})\n",
    "\\\\\n",
    "\\sigma(x_{31} * w_{11} + x_{32} * w_{21} + x_{33} * w_{31}) &\n",
    "\\sigma(x_{31} * w_{12} + x_{32} * w_{22} + x_{33} * w_{32})\n",
    "\\end{bmatrix} = \n",
    "\\begin{bmatrix}\n",
    "\\sigma(XW_{11}) & \\sigma(XW_{12})\\\\\n",
    "\\sigma(XW_{21}) & \\sigma(XW_{22})\\\\\n",
    "\\sigma(XW_{31}) & \\sigma(XW_{32})\n",
    "\\end{bmatrix}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ L = \\Lambda(\\sigma(X * W)) = \\Lambda(\\begin{bmatrix}\n",
    "\\sigma(XW_{11}) & \\sigma(XW_{12})\\\\\n",
    "\\sigma(XW_{21}) & \\sigma(XW_{22})\\\\\n",
    "\\sigma(XW_{31}) & \\sigma(XW_{32})\n",
    "\\end{bmatrix}) =  \\sigma(XW_{11}) + \\sigma(XW_{12}) + \\sigma(XW_{21}) + \\sigma(XW_{22}) + \\sigma(XW_{31}) + \\sigma(XW_{32})\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ \\frac{\\partial \\Lambda}{\\partial u}(X) = \n",
    "\\begin{bmatrix}\n",
    "\\frac{\\partial \\Lambda}{\\partial u}(x_{11}) & \n",
    "\\frac{\\partial \\Lambda}{\\partial u}(x_{12}) & \n",
    "\\frac{\\partial \\Lambda}{\\partial u}(x_{13}) \\\\\n",
    "\\frac{\\partial \\Lambda}{\\partial u}(x_{21}) & \n",
    "\\frac{\\partial \\Lambda}{\\partial u}(x_{22}) & \n",
    "\\frac{\\partial \\Lambda}{\\partial u}(x_{23}) \\\\\n",
    "\\frac{\\partial \\Lambda}{\\partial u}(x_{31}) & \n",
    "\\frac{\\partial \\Lambda}{\\partial u}(x_{32}) & \n",
    "\\frac{\\partial \\Lambda}{\\partial u}(x_{33}) \n",
    "\\end{bmatrix} $$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ S = \\sigma(X * W) = \\begin{bmatrix}\n",
    "s_{11} & s_{12} \\\\\n",
    "s_{21} & s_{22} \\\\\n",
    "s_{31} & s_{32}\n",
    "\\end{bmatrix} $$\n",
    "\n",
    "$$ \\frac{\\partial \\Lambda}{\\partial u}(S) = \\begin{bmatrix}\n",
    "1 & 1\\\\\n",
    "1 & 1\\\\\n",
    "1 & 1\n",
    "\\end{bmatrix} $$"
   ]
  },
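  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A short justification for the matrix of ones, written entry by entry (the indices $i$ and $j$ are notation introduced here). With $s_{ij}$ denoting the entries of $S$, $\\Lambda$ simply adds them up:\n",
    "\n",
    "$$ \\Lambda(S) = \\sum_{i=1}^{3} \\sum_{j=1}^{2} s_{ij} \\quad \\Rightarrow \\quad \\frac{\\partial \\Lambda}{\\partial s_{ij}} = 1 \\text{ for every } i, j $$\n",
    "\n",
    "Each $s_{ij}$ appears exactly once in the sum with coefficient $1$, so increasing any single entry by a small amount increases $\\Lambda$ by the same amount."
   ]
  },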
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With $N = \\nu(X, W) = X * W$:\n",
    "\n",
    "$$ \\frac{\\partial \\sigma}{\\partial u}(N) = \\begin{bmatrix}\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{11}) &\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{12}) \\\\\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{21}) &\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{22}) \\\\\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{31}) &\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{32})\n",
    "\\end{bmatrix} $$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$ L = \\Lambda(\\sigma(\\nu(X, W))) $"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$ \\frac{\\partial \\Lambda}{\\partial u}(X) = \n",
    "\\frac{\\partial \\nu}{\\partial X}(X, W) *\n",
    "\\frac{\\partial \\sigma}{\\partial u}(N) *\n",
    "\\frac{\\partial \\Lambda}{\\partial u}(S) $"
   ]
  },
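  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ \\frac{\\partial \\Lambda}{\\partial u}(N) = \\frac{\\partial \\sigma}{\\partial u}(N) \\odot \\frac{\\partial \\Lambda}{\\partial u}(S) = \\frac{\\partial \\sigma}{\\partial u}(N) $$\n",
    "\n",
    "Here $\\odot$ is elementwise multiplication; the second equality holds because every entry of $\\frac{\\partial \\Lambda}{\\partial u}(S)$ is $1$."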
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ \\frac{\\partial \\Lambda}{\\partial X}(X) = \n",
    "\\frac{\\partial \\Lambda}{\\partial u}(N) * ? = \n",
    "\\begin{bmatrix}\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{11}) &\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{12}) \\\\\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{21}) &\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{22}) \\\\\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{31}) &\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{32})\n",
    "\\end{bmatrix} * ? $$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ \\frac{\\partial \\Lambda}{\\partial X}(X) = \n",
    "\\begin{bmatrix}\n",
    "\\frac{\\partial \\Lambda}{\\partial u}(x_{11}) & \n",
    "\\frac{\\partial \\Lambda}{\\partial u}(x_{12}) & \n",
    "\\frac{\\partial \\Lambda}{\\partial u}(x_{13}) \\\\\n",
    "\\frac{\\partial \\Lambda}{\\partial u}(x_{21}) & \n",
    "\\frac{\\partial \\Lambda}{\\partial u}(x_{22}) & \n",
    "\\frac{\\partial \\Lambda}{\\partial u}(x_{23}) \\\\\n",
    "\\frac{\\partial \\Lambda}{\\partial u}(x_{31}) & \n",
    "\\frac{\\partial \\Lambda}{\\partial u}(x_{32}) & \n",
    "\\frac{\\partial \\Lambda}{\\partial u}(x_{33}) \n",
    "\\end{bmatrix} $$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ \\sigma(XW_{11}) = \\sigma(x_{11} * w_{11} + x_{12} * w_{21} + x_{13} * w_{31}) $$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ \\frac{\\partial \\sigma(XW_{11})}{\\partial X} = \\begin{bmatrix}\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{11}) * w_{11} & \n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{11}) * w_{21} & \n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{11}) * w_{31} \\\\\n",
    "0 &\n",
    "0 & \n",
    "0 \\\\\n",
    "0 & \n",
    "0 & \n",
    "0 \n",
    "\\end{bmatrix} $$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ \\sigma(XW_{32}) = \\sigma(x_{31} * w_{12} + x_{32} * w_{22} + x_{33} * w_{32}) $$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ \\frac{\\partial \\sigma(XW_{32})}{\\partial X} = \\begin{bmatrix}\n",
    "0 & \n",
    "0 & \n",
    "0 \\\\\n",
    "0 &\n",
    "0 & \n",
    "0 \\\\\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{32}) * w_{12} & \n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{32}) * w_{22} & \n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{32}) * w_{32} \n",
    "\\end{bmatrix} $$"
   ]
  },
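  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The two matrices above are instances of one pattern. As a sketch (the indices $i, j, k, l, m$ are notation introduced here), since $XW_{ij} = \\sum_{m} x_{im} * w_{mj}$ involves only row $i$ of $X$:\n",
    "\n",
    "$$ \\frac{\\partial \\sigma(XW_{ij})}{\\partial x_{kl}} = \\begin{cases}\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{ij}) * w_{lj} & \\text{if } k = i \\\\\n",
    "0 & \\text{otherwise}\n",
    "\\end{cases} $$\n",
    "\n",
    "For $i = 1, j = 1$ the nonzero row is row $1$ with entries proportional to $w_{11}, w_{21}, w_{31}$; for $i = 3, j = 2$ it is row $3$ with entries proportional to $w_{12}, w_{22}, w_{32}$."
   ]
  },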
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Derivative calculation for matrix multiplication example"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This calculation is in the appendix of the book; it may be easier to follow here than it is to follow there."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ \\frac{\\partial \\Lambda}{\\partial X}(X) = \n",
    "\\frac{\\partial \\sigma(XW_{11})}{\\partial X} + \n",
    "\\frac{\\partial \\sigma(XW_{12})}{\\partial X} + \n",
    "\\frac{\\partial \\sigma(XW_{21})}{\\partial X} + \n",
    "\\frac{\\partial \\sigma(XW_{22})}{\\partial X} + \n",
    "\\frac{\\partial \\sigma(XW_{31})}{\\partial X} + \n",
    "\\frac{\\partial \\sigma(XW_{32})}{\\partial X} = \n",
    "\\begin{bmatrix}\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{11}) * w_{11} & \n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{11}) * w_{21} & \n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{11}) * w_{31} \\\\\n",
    "0 &\n",
    "0 & \n",
    "0 \\\\\n",
    "0 & \n",
    "0 & \n",
    "0 \\end{bmatrix} +\n",
    "\\begin{bmatrix}\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{12}) * w_{12} & \n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{12}) * w_{22} & \n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{12}) * w_{32} \\\\\n",
    "0 &\n",
    "0 & \n",
    "0 \\\\\n",
    "0 & \n",
    "0 & \n",
    "0 \\end{bmatrix} + \n",
    "\\begin{bmatrix}\n",
    "0 & \n",
    "0 & \n",
    "0 \\\\\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{21}) * w_{11} &\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{21}) * w_{21} & \n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{21}) * w_{31} \\\\\n",
    "0 & \n",
    "0 & \n",
    "0 \\end{bmatrix} + \n",
    "\\begin{bmatrix}\n",
    "0 & \n",
    "0 & \n",
    "0 \\\\\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{22}) * w_{12} &\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{22}) * w_{22} & \n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{22}) * w_{32} \\\\\n",
    "0 & \n",
    "0 & \n",
    "0 \\end{bmatrix} +\n",
    "\\begin{bmatrix}\n",
    "0 & \n",
    "0 &\n",
    "0 \\\\\n",
    "0 &\n",
    "0 & \n",
    "0 \\\\\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{31}) * w_{11} &\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{31}) * w_{21} & \n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{31}) * w_{31} \\end{bmatrix} +\n",
    "\\begin{bmatrix}\n",
    "0 &\n",
    "0 &\n",
    "0 \\\\\n",
    "0 &\n",
    "0 & \n",
    "0 \\\\\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{32}) * w_{12} & \n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{32}) * w_{22} & \n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{32}) * w_{32} \\end{bmatrix}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ \\frac{\\partial \\Lambda}{\\partial X}(X) = \n",
    "\\begin{bmatrix}\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{11}) * w_{11} + \\frac{\\partial \\sigma}{\\partial u}(XW_{12}) * w_{12} & \n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{11}) * w_{21} + \\frac{\\partial \\sigma}{\\partial u}(XW_{12}) * w_{22} & \n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{11}) * w_{31} + \\frac{\\partial \\sigma}{\\partial u}(XW_{12}) * w_{32} \\\\ \n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{21}) * w_{11} + \\frac{\\partial \\sigma}{\\partial u}(XW_{22}) * w_{12} & \n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{21}) * w_{21} + \\frac{\\partial \\sigma}{\\partial u}(XW_{22}) * w_{22} & \n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{21}) * w_{31} + \\frac{\\partial \\sigma}{\\partial u}(XW_{22}) * w_{32} \\\\ \n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{31}) * w_{11} + \\frac{\\partial \\sigma}{\\partial u}(XW_{32}) * w_{12} & \n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{31}) * w_{21} + \\frac{\\partial \\sigma}{\\partial u}(XW_{32}) * w_{22} & \n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{31}) * w_{31} + \\frac{\\partial \\sigma}{\\partial u}(XW_{32}) * w_{32} \\end{bmatrix} \n",
    "$$"
   ]
  },
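  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Reading the summed matrix entry by entry makes its structure easier to recognize. As a sketch (the indices $k, l, j$ and the bracket notation $[\\cdot]_{kj}$ for a matrix entry are introduced here), the entry in row $k$ and column $l$ is\n",
    "\n",
    "$$ \\sum_{j=1}^{2} \\frac{\\partial \\sigma}{\\partial u}(XW_{kj}) * w_{lj} = \\sum_{j=1}^{2} \\left[ \\frac{\\partial \\sigma}{\\partial u}(X * W) \\right]_{kj} * \\left[ W^T \\right]_{jl} $$\n",
    "\n",
    "The right-hand side is exactly the definition of the $(k, l)$ entry of a matrix product: the $3 \\times 2$ matrix of $\\frac{\\partial \\sigma}{\\partial u}$ values multiplied by the $2 \\times 3$ matrix $W^T$."
   ]
  },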
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ W = \\begin{bmatrix}\n",
    "w_{11} & w_{12} \\\\\n",
    "w_{21} & w_{22} \\\\\n",
    "w_{31} & w_{32} \\end{bmatrix} $$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ \\frac{\\partial \\Lambda}{\\partial X}(X) = \n",
    "\\frac{\\partial \\Lambda}{\\partial u}(N) * W^T\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$  \n",
    "\\frac{\\partial \\Lambda}{\\partial u}(N) = \n",
    "\\begin{bmatrix}\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{11}) &\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{12}) \\\\\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{21}) &\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{22}) \\\\\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{31}) &\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{32})\n",
    "\\end{bmatrix}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ \\frac{\\partial \\Lambda}{\\partial X}(X) = \n",
    "\\begin{bmatrix}\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{11}) &\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{12}) \\\\\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{21}) &\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{22}) \\\\\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{31}) &\n",
    "\\frac{\\partial \\sigma}{\\partial u}(XW_{32})\n",
    "\\end{bmatrix} * \n",
    "\\begin{bmatrix}\n",
    "w_{11} & w_{21} & w_{31} \\\\\n",
    "w_{12} & w_{22} & w_{32}\n",
    "\\end{bmatrix} = \\frac{\\partial \\Lambda}{\\partial u}(N) * W^T\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Like meat off the bone!"
   ]
  },
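  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The same entry-by-entry argument can be applied to the weights. As a sketch (the indices $i, k, l$ and the bracket notation for matrix entries are introduced here), $w_{kl}$ appears only in column $l$ of $X * W$, with coefficient $x_{ik}$ in row $i$, so for $L = \\sum_{i, j} \\sigma(XW_{ij})$:\n",
    "\n",
    "$$ \\frac{\\partial L}{\\partial w_{kl}} = \\sum_{i=1}^{3} x_{ik} * \\frac{\\partial \\sigma}{\\partial u}(XW_{il}) = \\sum_{i=1}^{3} \\left[ X^T \\right]_{ki} * \\left[ \\frac{\\partial \\sigma}{\\partial u}(X * W) \\right]_{il} $$\n",
    "\n",
    "This is the $(k, l)$ entry of $X^T$ times the matrix of $\\frac{\\partial \\sigma}{\\partial u}$ values: a $3 \\times 3$ matrix times a $3 \\times 2$ matrix, giving a $3 \\times 2$ result with the same shape as $W$."
   ]
  },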
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ \\frac{\\partial \\sigma}{\\partial X}(X, W) = \\frac{\\partial \\sigma}{\\partial u}(N) * W^T $$ \n",
    "\n",
    "$$ \\frac{\\partial \\sigma}{\\partial W}(X, W) = X^T * \\frac{\\partial \\sigma}{\\partial u}(N) $$ "
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
